Goto

Collaborating Authors

 da algorithm




Reasoning with Preference Constraints: A Benchmark for Language Models in Many-to-One Matching Markets

Fauchard, Marylou, Carichon, Florian, Carvalho, Margarida, Farnadi, Golnoosh

arXiv.org Artificial Intelligence

Recent advances in reasoning with large language models (LLMs) have demonstrated strong performance on complex mathematical tasks, including combinatorial optimization. Techniques such as Chain-of-Thought and In-Context Learning have further enhanced this capability, making LLMs both powerful and accessible tools for a wide range of users, including non-experts. However, applying LLMs to matching problems, which require reasoning under preferential and structural constraints, remains underexplored. To address this gap, we introduce a novel benchmark of 369 instances of the College Admission Problem, a canonical example of a matching problem with preferences, to evaluate LLMs across key dimensions: feasibility, stability, and optimality. We employ this benchmark to assess the performance of several open-weight LLMs. Our results first reveal that while LLMs can satisfy certain constraints, they struggle to meet all evaluation criteria consistently. They also show that reasoning LLMs, like QwQ and GPT-oss, significantly outperform traditional models such as Llama, Qwen or Mistral, defined here as models used without any dedicated reasoning mechanisms. Moreover, we observed that LLMs reacted differently to the various prompting strategies tested, which include Chain-of-Thought, In-Context Learning and role-based prompting, with no prompt consistently offering the best performance. Finally, we report the performances from iterative prompting with auto-generated feedback and show that they are not monotonic; they can peak early and then significantly decline in later attempts. Overall, this work offers a new perspective on model reasoning performance and the effectiveness of prompting strategies in combinatorial optimization problems with preferential constraints.


Fast Mixing of Data Augmentation Algorithms: Bayesian Probit, Logit, and Lasso Regression

Lee, Holden, Zhang, Kexin

arXiv.org Machine Learning

Despite the widespread use of the data augmentation (DA) algorithm, the theoretical understanding of its convergence behavior remains incomplete. We prove the first non-asymptotic polynomial upper bounds on mixing times of three important DA algorithms: DA algorithm for Bayesian Probit regression (Albert and Chib, 1993, ProbitDA), Bayesian Logit regression (Polson, Scott, and Windle, 2013, LogitDA), and Bayesian Lasso regression (Park and Casella, 2008, Rajaratnam et al., 2015, LassoDA). Concretely, we demonstrate that with $\eta$-warm start, parameter dimension $d$, and sample size $n$, the ProbitDA and LogitDA require $\mathcal{O}\left(nd\log \left(\frac{\log \eta}{\epsilon}\right)\right)$ steps to obtain samples with at most $\epsilon$ TV error, whereas the LassoDA requires $\mathcal{O}\left(d^2(d\log d +n \log n)^2 \log \left(\frac{\eta}{\epsilon}\right)\right)$ steps. The results are generally applicable to settings with large $n$ and large $d$, including settings with highly imbalanced response data in the Probit and Logit regression. The proofs are based on the Markov chain conductance and isoperimetric inequalities. Assuming that data are independently generated from either a bounded, sub-Gaussian, or log-concave distribution, we improve the guarantees for ProbitDA and LogitDA to $\tilde{\mathcal{O}}(n+d)$ with high probability, and compare it with the best known guarantees of Langevin Monte Carlo and Metropolis Adjusted Langevin Algorithm. We also discuss the mixing times of the three algorithms under feasible initialization.


Putting Gale & Shapley to Work: Guaranteeing Stability Through Learning

Hosseini, Hadi, Roy, Sanjukta, Zhang, Duohan

arXiv.org Artificial Intelligence

Two-sided matching markets describe a large class of problems wherein participants from one side of the market must be matched to those from the other side according to their preferences. In many real-world applications (e.g. content matching or online labor markets), the knowledge about preferences may not be readily available and must be learned, i.e., one side of the market (aka agents) may not know their preferences over the other side (aka arms). Recent research on online settings has focused primarily on welfare optimization aspects (i.e. minimizing the overall regret) while paying little attention to the game-theoretic properties such as the stability of the final matching. In this paper, we exploit the structure of stable solutions to devise algorithms that improve the likelihood of finding stable solutions. We initiate the study of the sample complexity of finding a stable matching, and provide theoretical bounds on the number of samples needed to reach a stable matching with high probability. Finally, our empirical results demonstrate intriguing tradeoffs between stability and optimality of the proposed algorithms, further complementing our theoretical findings.


A Bayesian Data Augmentation Approach for Learning Deep Models

Toan Tran, Trung Pham, Gustavo Carneiro, Lyle Palmer, Ian Reid

Neural Information Processing Systems

Data augmentation is an essential part of the training process applied to deep learning models. The motivation is that a robust training process for deep learning models depends on large annotated datasets, which are expensive to be acquired, stored and processed. Therefore a reasonable alternative is to be able to automatically generate new annotated training samples using a process known as data augmentation. The dominant data augmentation approach in the field assumes that new training samples can be obtained via random geometric or appearance transformations applied to annotated training samples, but this is a strong assumption because it is unclear if this is a reliable generative model for producing new training samples. In this paper, we provide a novel Bayesian formulation to data augmentation, where new annotated training points are treated as missing variables and generated based on the distribution learned from the training set. For learning, we introduce a theoretically sound algorithm -- generalised Monte Carlo expectation maximisation, and demonstrate one possible implementation via an extension of the Generative Adversarial Network (GAN). Classification results on MNIST, CIFAR-10 and CIFAR-100 show the better performance of our proposed method compared to the current dominant data augmentation approach mentioned above -- the results also show that our approach produces better classification results than similar GAN models.


The data augmentation algorithm

Roy, Vivekananda, Khare, Kshitij, Hobert, James P.

arXiv.org Machine Learning

The data augmentation (DA) algorithms are popular Markov chain Monte Carlo (MCMC) algorithms often used for sampling from intractable probability distributions. This review article comprehensively surveys DA MCMC algorithms, highlighting their theoretical foundations, methodological implementations, and diverse applications in frequentist and Bayesian statistics. The article discusses tools for studying the convergence properties of DA algorithms. Furthermore, it contains various strategies for accelerating the speed of convergence of the DA algorithms, different extensions of DA algorithms and outlines promising directions for future research. This paper aims to serve as a resource for researchers and practitioners seeking to leverage data augmentation techniques in MCMC algorithms by providing key insights and synthesizing recent developments.


Uber Stable: Formulating the Rideshare System as a Stable Matching Problem

Acharya, Rhea, Chen, Jessica, Xiao, Helen

arXiv.org Artificial Intelligence

Peer-to-peer ride-sharing platforms like Uber, Lyft, and DiDi have revolutionized the transportation industry and labor market. At its essence, these systems tackle the bipartite matching problem between two populations: riders and drivers. This research paper comprises two main components: an initial literature review of existing ride-sharing platforms and efforts to enhance driver satisfaction, and the development of a novel algorithm implemented through simulation testing to allow us to make our own observations. The core algorithm utilized is the Gale-Shapley deferred acceptance algorithm, applied to a static matching problem over multiple time periods. In this simulation, we construct a preference-aware task assignment model, considering both overall revenue maximization and individual preference satisfaction. Specifically, the algorithm design incorporates factors such as passenger willingness-to-pay, driver preferences, and location attractiveness, with an overarching goal of achieving equitable income distribution for drivers while maintaining overall system efficiency. Through simulation, the paper compares the performance of the proposed algorithm with random matching and closest neighbor algorithms, looking at metrics such as total revenue, revenue per ride, and standard deviation to identify trends and impacts of shifting priorities. Additionally, the DA algorithm is compared to the Boston algorithm, and the paper explores the effect of prioritizing proximity to passengers versus distance from city center. Ultimately, the research underscores the importance of continued exploration in areas such as dynamic pricing models and additional modeling for unconventional driving times to further enhance the findings on the effectiveness and fairness of ride-sharing platforms.


SMORE: Similarity-based Hyperdimensional Domain Adaptation for Multi-Sensor Time Series Classification

Wang, Junyao, Faruque, Mohammad Abdullah Al

arXiv.org Artificial Intelligence

Many real-world applications of the Internet of Things (IoT) employ machine learning (ML) algorithms to analyze time series information collected by interconnected sensors. However, distribution shift, a fundamental challenge in data-driven ML, arises when a model is deployed on a data distribution different from the training data and can substantially degrade model performance. Additionally, increasingly sophisticated deep neural networks (DNNs) are required to capture intricate spatial and temporal dependencies in multi-sensor time series data, often exceeding the capabilities of today's edge devices. In this paper, we propose SMORE, a novel resource-efficient domain adaptation (DA) algorithm for multi-sensor time series classification, leveraging the efficient and parallel operations of hyperdimensional computing. SMORE dynamically customizes test-time models with explicit consideration of the domain context of each sample to mitigate the negative impacts of domain shifts. Our evaluation on a variety of multi-sensor time series classification tasks shows that SMORE achieves on average 1.98% higher accuracy than state-of-the-art (SOTA) DNN-based DA algorithms with 18.81x faster training and 4.63x faster inference.


Improved Bandits in Many-to-one Matching Markets with Incentive Compatibility

Kong, Fang, Li, Shuai

arXiv.org Artificial Intelligence

Two-sided matching markets have been widely studied in the literature due to their rich applications. Since participants are usually uncertain about their preferences, online algorithms have recently been adopted to learn them through iterative interactions. \citet{wang2022bandit} initiate the study of this problem in a many-to-one setting with \textit{responsiveness}. However, their results are far from optimal and lack guarantees of incentive compatibility. An extension of \citet{kong2023player} to this more general setting achieves a near-optimal bound for player-optimal regret. Nevertheless, due to the substantial requirement for collaboration, a single player's deviation could lead to a huge increase in its own cumulative rewards and an $O(T)$ regret for others. In this paper, we aim to enhance the regret bound in many-to-one markets while ensuring incentive compatibility. We first propose the adaptively explore-then-deferred-acceptance (AETDA) algorithm for responsiveness setting and derive an $O(N\min\left\{N,K\right\}C\log T/\Delta^2)$ upper bound for player-optimal stable regret while demonstrating its guarantee of incentive compatibility, where $N$ represents the number of players, $K$ is the number of arms, $T$ denotes the time horizon, $C$ is arms' total capacities and $\Delta$ signifies the minimum preference gap among players. This result is a significant improvement over \citet{wang2022bandit}. And to the best of our knowledge, it constitutes the first player-optimal guarantee in matching markets that offers such robust assurances. We also consider broader \textit{substitutable} preferences, one of the most general conditions to ensure the existence of a stable matching and cover responsiveness. We devise an online DA (ODA) algorithm and establish an $O(NK\log T/\Delta^2)$ player-pessimal stable regret bound for this setting.